18 research outputs found
A Partially Linear Censored Quantile Regression Model for Unemployment Duration
Censored Regression Quantile (CRQ) methods provide a powerful and flexible approach for the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases, however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models, where one (or more) of the explanatory variables are assumed to act on the response through a non-linear function. Here the CRQ approach (Portnoy, 2003) is extended to this partially linear setting. Basic consistency results are presented. A simulation experiment and an analysis of unemployment data justify the use of the partially linear approach over methods based on the Cox proportional hazards regression model and methods not permitting nonlinearity.
Keywords: quantile regression; partially linear models; B-splines; censored data; unemployment duration
A multivariate Bayesian learning approach for improved detection of doping in athletes using urinary steroid profiles
Biomarker analysis of athletes' urinary steroid profiles is crucial for the success of anti-doping efforts. Current statistical analysis methods generate personalised limits for each athlete based on univariate modelling of longitudinal biomarker values from the urinary steroid profile. However, simultaneous modelling of multiple biomarkers has the potential to further enhance abnormality detection. In this study, we propose a multivariate Bayesian adaptive model for longitudinal data analysis, which extends the established single-biomarker model in forensic toxicology. The proposed approach employs Markov chain Monte Carlo sampling methods and addresses the scarcity of confirmed abnormal values through a one-class classification algorithm. By adapting decision boundaries as new measurements are obtained, the model provides robust and personalised detection thresholds for each athlete. We tested the proposed approach on a database of 229 athletes which includes longitudinal steroid profiles classified as normal, atypical, or confirmed abnormal. Our results demonstrate improved detection performance, highlighting the potential value of a multivariate approach in doping detection.
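The adaptive, personalised thresholds described in this abstract can be illustrated with a much simpler stand-in: a conjugate multivariate-normal update of an athlete-level biomarker mean, with the predictive distribution supplying a detection boundary that tightens as measurements accrue. Everything below (the two-biomarker setup, the priors, the known covariance, the chi-squared cut-off) is an illustrative assumption, not the paper's MCMC model:

```python
import numpy as np

# Hedged sketch (not the paper's model): conjugate multivariate-normal
# updating of one athlete's personal biomarker mean. The predictive
# covariance S + Sigma supplies an adaptive, personalised detection limit.

mu0 = np.array([0.0, 0.0])            # population prior mean (assumed)
S0 = np.eye(2)                        # prior covariance of the athlete mean
Sigma = 0.5 * np.eye(2)               # within-athlete covariance (assumed known)

def update(mu, S, y, Sigma):
    """One conjugate update of the athlete-level mean after observing y."""
    K = S @ np.linalg.inv(S + Sigma)  # Kalman-style gain
    return mu + K @ (y - mu), (np.eye(len(mu)) - K) @ S

def is_abnormal(y, mu, S, Sigma, thresh=9.21):  # chi2(2) 99th percentile
    """Flag y if its Mahalanobis distance under the predictive exceeds thresh."""
    P = S + Sigma                     # predictive covariance
    d = y - mu
    return float(d @ np.linalg.solve(P, d)) > thresh

mu, S = mu0, S0
for y in [np.array([0.2, -0.1]), np.array([0.1, 0.3])]:
    mu, S = update(mu, S, y, Sigma)   # boundary adapts with each new sample

flagged = is_abnormal(np.array([5.0, 5.0]), mu, S, Sigma)
```

As the posterior covariance S shrinks with each measurement, the predictive region contracts, which is the sense in which the decision boundary becomes personalised.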
On monotonicity of regression quantile functions
In the linear regression quantile model, the conditional quantile of the response, Y, given x is Q_{Y|x}(τ) ≡ x′β(τ). Though Q_{Y|x}(τ) must be monotonically increasing in τ, the Koenker–Bassett regression quantile estimator, x′β̂(τ), is not monotonic outside a vanishingly small neighborhood of x̄. Given a grid of mesh δ_n, let Q̃_{Y|x}(τ) be the linear interpolation of the values of x′β̂(τ) along the grid. We show here that for a range of rates δ_n, Q̃_{Y|x}(τ) will be strictly monotonic (with probability tending to one) and will be asymptotically equivalent to x′β̂(τ) in the sense that n^{1/2} times the difference tends to zero at a rate depending on δ_n.
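The interpolation construction in this abstract is easy to mimic numerically: fit (or, as here, simply posit) coefficient estimates on a τ-grid, linearly interpolate the fitted quantiles x′β̂(τ) between grid points, and check monotonicity in τ. The coefficient values below are toy stand-ins, not estimates from any real regression:

```python
import numpy as np

# Toy sketch (not the paper's estimator): given regression-quantile
# coefficients beta_hat(tau_j) on a grid of mesh delta_n = 0.1, build the
# linear interpolant of the fitted quantiles and verify it is monotone in tau.

tau_grid = np.arange(0.1, 0.91, 0.1)
# Hypothetical (intercept, slope) estimates at each tau: intercepts taken as
# empirical N(0,1) quantiles, slope held constant for illustration.
sample = np.random.default_rng(0).normal(0.0, 1.0, 10_000)
beta_hat = np.column_stack([np.quantile(sample, tau_grid),
                            np.full(tau_grid.size, 2.0)])
x = np.array([1.0, 0.5])            # design point (leading 1 = intercept)

q_hat = beta_hat @ x                # fitted conditional quantiles on the grid

def interpolated_quantile(tau, tau_grid, q_hat):
    """Piecewise-linear interpolation of the fitted quantile function."""
    return np.interp(tau, tau_grid, q_hat)

fine = np.linspace(0.1, 0.9, 201)
values = interpolated_quantile(fine, tau_grid, q_hat)
assert np.all(np.diff(values) >= 0)  # interpolant is monotone at this x
```

The paper's actual content is the asymptotic claim that this interpolant is strictly monotone with probability tending to one and n^{1/2}-equivalent to the raw estimator; the sketch only shows the mechanical construction.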
Retrospective sampling in MCMC with an application to COM-Poisson regression
The normalization constant in the distribution of a discrete random variable may not be available in closed form; in such cases, the calculation of the likelihood can be computationally expensive. Approximations of the likelihood or approximate Bayesian computation methods can be used, but the resulting Markov chain Monte Carlo (MCMC) algorithm may not sample from the target of interest. In certain situations, one can efficiently compute lower and upper bounds on the likelihood. As a result, the target density and the acceptance probability of the Metropolis–Hastings algorithm can be bounded. We propose an efficient and exact MCMC algorithm based on the idea of retrospective sampling. This procedure can be applied to a number of discrete distributions, one of which is the Conway–Maxwell–Poisson distribution. In practice, the bounds on the acceptance probability do not need to be particularly tight in order to accept or reject a move. We demonstrate this method using data on emergency hospital admissions in Scotland in 2010, where the main interest lies in the estimation of the variability of admissions, as it is considered a proxy for health inequalities.
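The retrospective idea can be sketched concretely for the Conway–Maxwell–Poisson case: bracket the intractable normalising constant Z(λ, ν) = Σ_j λ^j/(j!)^ν with a partial sum (lower bound) and a geometric tail bound (upper bound), then refine the bracket only until the uniform draw u falls clearly below or above the bounded acceptance probability. All tuning values below (ν, step size, data) are illustrative assumptions, not the authors' settings:

```python
import math, random

# Illustrative sketch (not the authors' code): retrospective Metropolis-Hastings
# for the COM-Poisson rate lam with fixed nu and a flat prior. Z is never
# computed exactly; its bounds are tightened retrospectively, only as needed.

def z_bounds(lam, nu, k):
    """Bracket Z(lam, nu) using the first k+1 series terms."""
    s, t = 0.0, 1.0                       # t tracks lam**j / (j!)**nu
    for j in range(k + 1):
        s += t
        t *= lam / (j + 1) ** nu
    r = lam / (k + 2) ** nu               # term ratio past truncation (decreasing)
    upper = s + t / (1.0 - r) if r < 1.0 else math.inf
    return s, upper

def log_kernel(y, lam, nu):
    """Unnormalised COM-Poisson log-likelihood for count data y."""
    return sum(yi * math.log(lam) - nu * math.lgamma(yi + 1) for yi in y)

def retrospective_mh(y, nu=1.5, lam0=2.0, n_iter=2000, step=0.3, seed=1):
    rng = random.Random(seed)
    lam, chain, n = lam0, [], len(y)
    for _ in range(n_iter):
        prop = lam + rng.gauss(0.0, step)
        if prop > 0:
            log_ratio = log_kernel(y, prop, nu) - log_kernel(y, lam, nu)
            u, k = rng.random(), 5
            while True:                   # retrospective refinement of Z bounds
                lo_c, up_c = z_bounds(lam, nu, k)
                lo_p, up_p = z_bounds(prop, nu, k)
                a_lo = math.exp(log_ratio) * (lo_c / up_p) ** n
                a_hi = math.exp(log_ratio) * (up_c / lo_p) ** n
                if u < a_lo:              # u below lower bound on alpha: accept
                    lam = prop
                    break
                if u > a_hi or k > 1 << 20:
                    break                 # reject (or bounds numerically tight)
                k *= 2                    # tighten the bracket and retry
        chain.append(lam)
    return chain

chain = retrospective_mh([3, 1, 4, 2, 2, 5, 3])
```

The point made in the abstract shows up here: most moves are decided at the loose initial truncation, so the exact acceptance probability is almost never needed.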
An online application for the classification and evidence evaluation of forensic glass fragments
We present an easy-to-use and freely accessible online application for the analysis of forensic glass fragments. The application is browser based and takes as input .csv or .txt files containing measurements from glass fragments obtained using a scanning electron microscope with an energy-dispersive X-ray (SEM-EDX) spectrometer. The application was developed to (i) classify glass fragments into use-type categories (classification), and (ii) compute the evidential strength of two sets of fragments under competing propositions (evidence evaluation). Detailed examples of how to use the application for both tasks are given, which highlight its user-friendly interface. The suitability of the statistical methods used by the application was checked using simulation studies, and improvements upon previous methods were found in both tasks.
Partially linear censored quantile regression
Censored regression quantile (CRQ) methods provide a powerful and flexible approach to the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models: one (or more) of the explanatory covariates are assumed to act on the response through a non-linear function. Here the CRQ approach of Portnoy (J Am Stat Assoc 98:1001–1012, 2003) is extended to this partially linear setting. Basic consistency results are presented. A simulation experiment and unemployment example justify the value of the partially linear approach over methods based on the Cox proportional hazards model and on methods not permitting nonlinearity.
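The partially linear structure itself (setting censoring aside, which is the paper's actual technical contribution) is straightforward to sketch: one covariate keeps a linear coefficient while another enters through a spline basis, and both are fit under the quantile check loss. The data, the truncated-power basis (a simple stand-in for B-splines), and the subgradient-descent fitter below are all illustrative assumptions:

```python
import numpy as np

# Hedged sketch of the partially linear idea, ignoring censoring: y depends
# linearly on x but nonlinearly (through a spline) on z, fit at tau = 0.5.

rng = np.random.default_rng(0)
n = 400
x = rng.normal(0.0, 1.0, n)
z = rng.uniform(0.0, 1.0, n)
y = 1.5 * x + np.sin(2 * np.pi * z) + rng.normal(0.0, 0.3, n)

def spline_basis(z, knots):
    """Truncated-power cubic spline basis (simple stand-in for B-splines)."""
    cols = [np.ones_like(z), z, z**2, z**3]
    cols += [np.clip(z - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_quantile(design, y, tau, n_steps=3000, lr=0.05):
    """Minimise the pinball (check) loss by subgradient descent."""
    w = np.zeros(design.shape[1])
    for _ in range(n_steps):
        r = y - design @ w
        g = -design.T @ (tau - (r < 0)) / len(y)  # subgradient of check loss
        w -= lr * g
    return w

D = np.column_stack([x, spline_basis(z, knots=np.linspace(0.2, 0.8, 4))])
w = fit_quantile(D, y, tau=0.5)
beta_x = w[0]   # linear part; should sit near the true value 1.5
```

In the paper the same decomposition is carried through Portnoy's recursively reweighted CRQ estimator so that censored observations are handled correctly; the sketch shows only the uncensored skeleton.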
Evaluation of forensic data using logistic regression-based classification methods and an R Shiny implementation
We demonstrate the use of classification methods that are well-suited for forensic toxicology applications. The methods are based on penalized logistic regression, can be employed when separation occurs in a two-class classification setting, and allow for the calculation of likelihood ratios. A case study of this framework is demonstrated on alcohol biomarker data for classifying chronic alcohol drinkers. The approach can be extended to applications in the fields of analytical and forensic chemistry, where a large number of biomarkers is a common feature, and allows for flexibility in model assumptions such as multivariate normality. While some penalized regression methods have been introduced previously in forensic applications, our study is meant to encourage practitioners to use these powerful methods more widely. As such, based upon our proof-of-concept studies, we also introduce an R Shiny online tool with an intuitive interface able to perform several classification methods. We anticipate that this open-source and free-of-charge application will provide a powerful and dynamic tool for inferring the LR value in classification tasks.
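The two ingredients named in this abstract, a penalty that keeps coefficients finite under complete separation and a likelihood ratio read off as posterior odds divided by prior odds, can be sketched in a few lines. The synthetic "biomarker" data, the ridge penalty, and the plain gradient-descent fitter are illustrative assumptions (the paper works in R, on real alcohol-biomarker data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for biomarker data: class 1 = "chronic drinker",
# class 0 = "non-drinker", three class-shifted features.
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(0.0, 1.0, (n, 3)) + 1.2 * y[:, None]

def fit_penalized_logistic(X, y, lam=1.0, n_steps=500, lr=0.1):
    """Ridge-penalised logistic regression by gradient descent; the penalty
    keeps coefficients finite even when the classes are separable."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (p - y) / len(y)
        grad[1:] += lam * w[1:] / len(y)   # do not penalise the intercept
        w -= lr * grad
    return w

w = fit_penalized_logistic(X, y)

def likelihood_ratio(x_new, w, prior=0.5):
    """LR = posterior odds / prior odds for the two competing propositions."""
    p = 1.0 / (1.0 + np.exp(-(w[0] + x_new @ w[1:])))
    return (p / (1.0 - p)) / (prior / (1.0 - prior))

lr_val = likelihood_ratio(np.array([2.0, 2.0, 2.0]), w)   # class-1-like sample
```

Dividing out the training prior odds is what turns a posterior probability into an LR, so the value reported to the court does not depend on the (arbitrary) class balance of the training set.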
Transformations for compositional data with zeros with an application to forensic evidence evaluation
In forensic science, likelihood ratios provide a natural way of computing the value of evidence under competing propositions such as "the compared samples have originated from the same object" (prosecution) and "the compared samples have originated from different objects" (defence). We use a two-level multivariate likelihood ratio model for comparison of forensic glass evidence in the form of elemental composition data under three data transformations: the logratio transformation, a complementary log-log type transformation and a hyperspherical transformation. The performances of the three transformations in the evaluation of evidence are assessed in simulation experiments through use of the proportions of false negatives and false positives.
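Two of the transformation families mentioned above are simple enough to sketch directly, including the zero problem the title refers to: a logratio transform is undefined at zero components, whereas a square-root (hyperspherical) transform needs no special handling. The composition, the eps value, and the crude replace-and-reclose zero treatment below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

# Illustrative sketch: transforming a composition (proportions summing to one)
# that contains a zero component, e.g. an elemental composition of glass.
comp = np.array([0.70, 0.25, 0.05, 0.0])

def alr(x, eps=1e-6):
    """Additive logratio transform; zeros are replaced by a small eps and the
    composition re-closed -- one common (if crude) zero treatment."""
    x = np.where(x == 0, eps, x)
    x = x / x.sum()
    return np.log(x[:-1] / x[-1])

def hyperspherical(x):
    """Square-root transform: root-proportions lie on the unit sphere, so a
    zero component needs no special handling."""
    return np.sqrt(x)

z = alr(comp)
s = hyperspherical(comp)
assert np.isclose(np.sum(s ** 2), 1.0)   # lies on the unit sphere
```

The sketch makes the trade-off visible: the logratio values blow up as eps shrinks, while the hyperspherical coordinates are stable, which is exactly the kind of behaviour the false-positive/false-negative comparison in the paper is probing.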